In [1]:
import pandas as pd
from datetime import timedelta
# ****************** Program Settings ******************
Folder = "" # Location of program scripts
Data = "temp/" # Location to which temporary files are generated
DataSource = "data/" # Location of the original data files (ASCII)
Gap_Days = 60 # To qualify as a gap, the break between one fund's end and the next fund's inception must be at least 60 days.
# If you would like to allow overlapping funds, change this to a negative number.
# ******************************************************
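A quick toy check of the threshold on made-up dates (the break below is roughly 90 days, so it should qualify as a gap):
In [ ]:
# Toy illustration of the Gap_Days threshold (hypothetical dates, not from the data)
end_of_previous_fund = pd.Timestamp('2010-01-31')
start_of_next_fund = pd.Timestamp('2010-05-01')
print(end_of_previous_fund + timedelta(days=Gap_Days) <= start_of_next_fund)  # True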
In [2]:
# Load manager details
data_manager = pd.read_stata(DataSource + 'PeopleDetails.dta')
data_manager = data_manager[data_manager.PersonTypeID==1]
variables_to_keep = "ProductReference PersonID First Last JobTitle Address1 Address2 Address3 CityName StateName Zip CountryName".split()
data_manager = data_manager[variables_to_keep]
data_manager.head()
Out[2]:
In [3]:
# Extract InceptionDate / PerformanceEndDate
data_dates = pd.read_stata(DataSource + 'ProductDetails.dta')
variables_to_keep = "ProductReference InceptionDate PerformanceEndDate".split()
data_dates = data_dates[variables_to_keep]
data_dates.head()
Out[3]:
Merge managers with the start/end dates
In [4]:
# Inner join - non-matches are excluded
data_merged = data_manager.merge(data_dates, on='ProductReference', how='inner')
data_merged = data_merged.sort_values(by=['PersonID',
                                          'PerformanceEndDate',
                                          'InceptionDate',
                                          'ProductReference'])
data_merged.head()
Out[4]:
Find the first end date for each PersonID. The variable first_end_date is defined as the earliest PerformanceEndDate among each person's funds.
In [5]:
# Create a GroupBy object
grouped = data_merged[['PersonID', 'PerformanceEndDate']].groupby('PersonID', as_index=False, axis=0)  # axis=0 splits the DataFrame along its rows (the default)
grouped.groups
print(grouped.get_group(4))
grouped.min().head()
Out[5]:
In [6]:
# Find the smallest value in the end date for each PersonID
transformed = grouped.min()
transformed['first_end_date'] = transformed['PerformanceEndDate']
transformed = transformed.drop('PerformanceEndDate', axis=1)
transformed.head()
Out[6]:
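For reference, the same first_end_date table can be built in a single chained expression; this sketch is equivalent to the cell above and is not used by the rest of the pipeline, which keeps working with transformed.
In [ ]:
# Equivalent one-liner sketch (the pipeline continues to use `transformed` from above)
transformed_alt = (data_merged.groupby('PersonID', as_index=False)['PerformanceEndDate']
                   .min()
                   .rename(columns={'PerformanceEndDate': 'first_end_date'}))
transformed_alt.head()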
In [7]:
# Merge back to the main dataset
data_main = data_merged.merge(transformed, how='outer',
                              left_on='PersonID', right_on='PersonID')
data_main = data_main.sort_values(by=['PersonID',
                                      'InceptionDate',
                                      'PerformanceEndDate',
                                      'ProductReference'])
################################################
# DEBUG: uncomment to restrict the sample for testing;
# keep commented out for full runs
# data_main = data_main[:2000:]
# data_main = data_main[data_main.PersonID==799]
################################################
def find_gaps(person_panel):
    '''Find the number of gaps (relaunches) in one person's fund history.
    A gap follows fund i when (1) fund i ends no earlier than every fund
    incepted before it, (2) fund i's PerformanceEndDate lies at least
    Gap_Days before the InceptionDate of every fund incepted later, and
    (3) at least one such later fund exists.
    Returns the number of gaps and the ProductReference of the fund
    preceding each gap.
    '''
    gaps_number = 0
    fund_ID_preceed_gap = []
    # Reset the index so it runs from 0
    person_panel = person_panel.reset_index(drop=True)
    # print("Her data panel is as below:")
    # print(person_panel)
    for i, i_row in person_panel.iterrows():
        # Reset criteria status for a new i_row
        criteria_backward = True
        criteria_forward = True
        criteria_exist_after = False
        for j, j_row in person_panel.iterrows():
            if i == j:
                # Skip the comparison of a fund with itself
                continue
            # print('Comparison now made for row: ', i, j)
            # Criterion 1 - Looking backward: i_end >= j_end for all j with j_inc < i_inc
            # For all funds incepted earlier than i:
            # j: Inc-----End
            # i:   Inc------------End
            if j_row.InceptionDate < i_row.InceptionDate:
                criteria_backward &= check_backward(i_row, j_row)
            # Criterion 2 - Looking forward: i_end + Gap_Days <= j_inc for all j with j_inc >= i_inc
            # (the Gap_Days threshold is applied inside check_forward)
            # For all funds incepted later than i:
            # i: Inc---------End
            #                   |***GAP***|
            # j:                          Inc--------End
            if j_row.InceptionDate >= i_row.InceptionDate:
                criteria_forward &= check_forward(i_row, j_row)
            # Criterion 3 - There must be at least one fund incepted after fund i
            if j_row.InceptionDate >= i_row.InceptionDate:
                criteria_exist_after = True
        # If criteria 1, 2 and 3 are all satisfied,
        # fund i is the fund preceding a gap
        if criteria_backward and criteria_forward and criteria_exist_after:
            fund_ID_preceed_gap.append(i_row.ProductReference)
            gaps_number += 1
    return (gaps_number, fund_ID_preceed_gap)

def check_backward(i_row, j_row):
    """
    Compares the PerformanceEndDate of i_row with that of j_row.
    Returns True if i_end >= j_end;
    returns False otherwise.
    """
    if i_row.PerformanceEndDate >= j_row.PerformanceEndDate:
        return True
    else:
        return False

def check_forward(i_row, j_row):
    """
    Compares the PerformanceEndDate of i_row to the InceptionDate of j_row.
    Returns True if i_end + Gap_Days <= j_inc, i.e. the break is long enough;
    returns False otherwise.
    """
    global Gap_Days  # Gap_Days comes from the settings cell at the top
    gap_days = timedelta(days=Gap_Days)
    if i_row.PerformanceEndDate + gap_days <= j_row.InceptionDate:
        return True
    else:
        return False
grouped_by_person = data_main.groupby('PersonID', as_index=False, axis=0)
grouped_by_person.groups
# A new dict to store number of gaps found for each person
gaps = {}
fund_IDs = {}
for person_id, person_panel in grouped_by_person:
    # print("The person's PersonID is :", person_id)
    gaps_number, fund_ID_preceed_gap = find_gaps(person_panel[['ProductReference',
                                                               'InceptionDate',
                                                               'PerformanceEndDate']])
    gaps[person_id] = gaps_number
    fund_IDs[person_id] = fund_ID_preceed_gap
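As a sanity check, consider a hypothetical three-fund manager (dates invented for illustration): fund B ends more than 60 days before fund C starts, so find_gaps should report one gap, preceded by fund B.
In [ ]:
# Sanity check on hypothetical data (not from the source files)
toy_panel = pd.DataFrame({'ProductReference': ['A', 'B', 'C'],
                          'InceptionDate': pd.to_datetime(['2005-01-01', '2006-01-01', '2009-06-01']),
                          'PerformanceEndDate': pd.to_datetime(['2007-12-31', '2008-06-30', '2012-12-31'])})
print(find_gaps(toy_panel))  # Expected: (1, ['B'])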
In [18]:
# Transform the gaps dict into a DataFrame (fund_IDs is handled in the next cell)
data_gaps = pd.DataFrame.from_dict(gaps, orient='index')
data_gaps.columns = ['number_of_gaps']
data_gaps['PersonID'] = data_gaps.index
data_gaps = data_gaps.reset_index(drop=True)
data_gaps.head()
Out[18]:
In [20]:
data_gap_fund_IDs = pd.DataFrame.from_dict(fund_IDs, orient='index')
data_gap_fund_IDs.columns = ['fund_IDs_proceeding_1st_gap', 'fund_IDs_proceeding_2nd_gap']
data_gap_fund_IDs['PersonID'] = data_gap_fund_IDs.index
data_gap_fund_IDs = data_gap_fund_IDs.reset_index(drop=True)
data_gap_fund_IDs.head()
Out[20]:
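The fixed two-column labels above assume that no manager has more than two gaps. A sketch that labels however many columns actually appear (a hypothetical alternative, not used by the cells below):
In [ ]:
# Sketch: label one column per gap, however many gaps the widest row has
alt_fund_IDs = pd.DataFrame.from_dict(fund_IDs, orient='index')
alt_fund_IDs.columns = ['fund_IDs_preceding_gap_%d' % (i + 1)
                        for i in range(alt_fund_IDs.shape[1])]
alt_fund_IDs.head()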
In [21]:
# Merge with the main dataset
data_output = data_main.merge(data_gaps, how='outer',
                              left_on='PersonID', right_on='PersonID')
data_output = data_output.merge(data_gap_fund_IDs, how='outer',
                                left_on='PersonID', right_on='PersonID')
data_output.head()
Out[21]:
In [22]:
# Output datafiles
data_output.to_excel('data_output.xlsx')
data_output.to_stata('data_output.dta', convert_dates={13:'td', 14:'td', 15:'td'})
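convert_dates also accepts column names as keys, which is less fragile than positional indices if the column order ever changes. A sketch under the assumption that InceptionDate, PerformanceEndDate and first_end_date are the three datetime columns; it writes to a separate file so the positional version above is untouched.
In [ ]:
# Sketch: refer to the date columns by name rather than by position
# (assumes the three datetime columns are the ones named below)
data_output.to_stata('data_output_named_dates.dta',
                     convert_dates={'InceptionDate': 'td',
                                    'PerformanceEndDate': 'td',
                                    'first_end_date': 'td'})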